Bipartite Ranking through Minimization of Univariate Loss
نویسندگان
چکیده
Minimization of the rank loss or, equivalently, maximization of the AUC in bipartite ranking calls for minimizing the number of disagreements between pairs of instances. Since the complexity of this problem is inherently quadratic in the number of training examples, it is tempting to ask how much is actually lost by minimizing a simple univariate loss function, as done by standard classification methods, as a surrogate. In this paper, we first note that minimization of 0/1 loss is not an option, as it may yield an arbitrarily high rank loss. We show, however, that better results can be achieved by means of a weighted (cost-sensitive) version of 0/1 loss. Yet, the real gain is obtained through marginbased loss functions, for which we are able to derive proper bounds, not only for rank risk but, more importantly, also for rank regret. The paper is completed with an experimental study in which we address specific questions raised by our theoretical analysis.
منابع مشابه
On Theoretically Optimal Ranking Functions in Bipartite Ranking
This paper investigates the theoretical relation between loss criteria and the optimal ranking functions driven by the criteria in bipartite ranking. In particular, the relation between AUC maximization and minimization of ranking risk under a convex loss is examined. We characterize general conditions for ranking-calibrated loss functions in a pairwise approach, and show that the best ranking ...
متن کاملConsistent Multilabel Ranking through Univariate Losses
We consider the problem of rank loss minimization in the setting of multilabel classification, which is usually tackled by means of convex surrogate losses defined on pairs of labels. Very recently, this approach was put into question by a negative result showing that commonly used pairwise surrogate losses, such as exponential and logistic losses, are inconsistent. In this paper, we show a pos...
متن کاملAnomaly Ranking as Supervised Bipartite Ranking
The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with a MV curve as low as possible at any point. In this paper, it is proved th...
متن کاملBipartite ranking algorithm for classification and survival analysis
Unsupervised aggregation of independently built univariate predictors is explored as an alternative regularization approach for noisy, sparse datasets. Bipartite ranking algorithm Smooth Rank implementing this approach is introduced. The advantages of this algorithm are demonstrated on two types of problems. First, Smooth Rank is applied to two-class problems from bio-medical field, where ranki...
متن کاملRanking - convex risk minimization
The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems, in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0−1 loss. It makes these procedures computationally efficient. Hence, conv...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011